Abstract

The paper identifies and addresses four methodological
weaknesses common to most previous studies that have used
LISREL confirmatory factor analysis to test for the factorial
validity and invariance of a single measuring instrument.
Specifically, the paper demonstrates the steps involved in (a)
conducting sensitivity analyses to determine a statistically
best-fitting, yet substantively most meaningful baseline model,
(b) testing for partial measurement invariance, (c) testing for
the invariance of factor variances and covariances, given
partial measurement invariance, and (d) testing for the
invariance of test item and subscale reliabilities. These
procedures are illustrated with item response data from normal
and gifted children in grades 5 and 8, based on the Perceived
Competence Scale for Children.
Testing the Factorial Validity and Invariance of a Measuring
Instrument Using LISREL Confirmatory Factor Analyses:
A Reexamination and Application

In substantive research, an important assumption in 
single-group analyses is that the assessment instrument is 
measuring that which it was designed to measure (i.e., it is
factorially valid), and in multigroup analyses, that it is
doing so in exactly the same way across independent samples
(i.e., it is factorially invariant). Traditionally, the factor
structure of a measuring instrument has been validated by means
of exploratory factor analysis (EFA), and its invariance tested
by the comparison of EFA factors across groups using diverse ad
hoc procedures (for a review, see Marsh & Hocevar, 1985;
Reynolds & Harding, 1983). At this point in time, however, the
limitations of EFA are widely known (see e.g., Fornell, 1983;
Long, 1983; Marsh & Hocevar, 1985), as are the issues related
to tests of factorial invariance based on EFA factors (see
Alwin & Jackson, 1981).

A methodologically more sophisticated and statistically 
more powerful technique for such analyses is the confirmatory 
factor analytic (CFA) procedure proposed by Joreskog (1969),
and now commercially available through the LISREL VI computer
program (Joreskog & Sorbom, 1985). The LISREL CFA approach
allows researchers to test a series of hypotheses related to
(a) the factorial validity of an assessment instrument, and (b)
the equivalency of its factorial structure and measurements
across groups. While a number of construct validity studies
have applied the technique to multitrait-multimethod analyses
of assessment measures (e.g., Bachman & Palmer, 1981; Flamer,
1983; Forsythe, McGaghie, & Friedman, 1986; Marsh & Hocevar,
1984; Watkins & Hattie, 1981), few have used it to evaluate the
factorial validity or factorial invariance of a single
measuring instrument; of these, most have been incomplete in
terms of model-fitting procedures and tests of invariance. The
purpose of the present paper, in broad terms, is to address 
these limitations in a demonstration of LISREL CFA procedures 
for testing the factorial validity and invariance of a single 
measuring instrument. 

LISREL Confirmatory Factor Analysis 
Factor analysis, in general terms, is a statistical 
procedure for determining whether covariation among a set of 
observed variables can be explained by a smaller number of 
latent variables (i.e., factors). In contrast to EFA, where the 
only hypothesis tested concerns the number of factors 
underlying the observed data (Bentler, 1978), CFA permits the 
testing of several hypotheses; the number and degree of 
specificity being determined by the investigator. As such, 
based on his/her knowledge of theoretical and empirical
research, the investigator postulates a priori a particular
factor analytic model and then tests the model to determine 
whether or not it is consistent with the observed data; 
minimally, model specifications would include the number of 
latent factors, the pattern of factor loadings, and relations 
among the latent factors. 

The LISREL CFA framework incorporates two conceptually
distinct models: a measurement model and a structural model.
The first of these specifies how the observed (i.e., measured)
variables relate to the underlying latent (i.e., unobserved,
unmeasured) factors; the second specifies relations among the
latent factors themselves. In LISREL notation, this means that,
typically, the factor loading (lambda, $\Lambda$), error (theta,
$\Theta$), and latent factor variance-covariance (phi, $\Phi$)
matrices are of primary importance. More specifically, $\Lambda$
is a matrix of coefficients regressed from latent factors to
observed variables, and $\Theta$ is the variance-covariance
matrix of errors/uniquenesses. These matrices make up the
measurement aspect of the model. $\Phi$ is the factor
variance-covariance matrix and constitutes the structural part
of the model. Since a
number of papers are available to readers that (a) specify the
statistical theory underlying LISREL CFA (e.g., Joreskog, 1969;
Long, 1983), (b) outline basic notation and steps in using the
LISREL program (e.g., Lomax, 1982; Long, 1983; Wolfle, 1981),
and (c) summarize advantages of LISREL CFA over traditional EFA
procedures (e.g., Long, 1983; Marsh & Hocevar, 1985), these
details are not provided here.
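
For orientation only, the roles of the three matrices can be
written compactly. The block below is a standard statement of
the CFA covariance structure, not a formula quoted from this
paper; the symbols $x$ (observed variables), $\xi$ (latent
factors), $\delta$ (errors/uniquenesses) and $\Sigma$ (implied
covariance matrix) are introduced here for illustration, and
the factors are assumed to be uncorrelated with the errors.

```latex
\begin{align}
  x      &= \Lambda\,\xi + \delta, \\
  \Sigma &= \Lambda\,\Phi\,\Lambda^{\top} + \Theta,
  \qquad \Phi = \operatorname{Cov}(\xi), \quad
         \Theta = \operatorname{Cov}(\delta).
\end{align}
```

Under this sketch, $\Lambda$ and $\Theta$ carry the measurement
part of the model and $\Phi$ the structural part.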

The process of validating the factorial structure of a
measuring instrument and then testing for its invariance across
groups involves two separate analytical procedures; the first
is a prerequisite for the second. The initial step entails the
estimation of a baseline model; since this procedure involves
no between-group constraints, the data are analyzed separately
for each group. The baseline model represents the most
parsimonious, yet substantively meaningful and best-fitting
model to the data. Since instruments are often group-specific
in the way they operate, these models are not expected to be
identical across groups. For example, whereas the baseline
model for one group might include correlated measurement errors
and/or secondary factor loadings, this may not be so for the
second group. A priori knowledge of such group differences, as
will be illustrated later, is critical in testing for 
equivalencies across groups. 

Having determined the baseline model for each group, the 
investigator may then proceed to tests of factorial invariance. 
Since these analyses involve the imposition of constraints on 
particular parameters, the data from all groups must be
analyzed simultaneously to obtain efficient estimates (Joreskog
& Sorbom, 1985). It is important to note, however, that the
pattern of fixed and free parameters remains consistent with
the baseline model specification for each group. (For a review
of LISREL CFA invariance testing applications, see Byrne,
Shavelson, & Muthen, in press; for details of the procedure in
general, see Alwin & Jackson, 1981; Byrne et al., in press;
Joreskog, 1971a; Marsh & Hocevar, 1985; Rock, Werts, & Flaugher,
1978.)

A review of previous studies using LISREL CFA procedures to
validate assessment measures reveals several limitations.
First, with three exceptions (Byrne, in press; Marsh, 1987b;
Tanaka & Huba, 1984), researchers have not considered alternate
model specifications beyond the one initially hypothesized (see
Benson, 1987; Marsh, 1985, 1987a; Marsh & Hocevar, 1985; Marsh
& O'Neill, 1984; Marsh, Smith, & Barnes, 1985). In other words,
researchers have (a) postulated a model, (b) tested its fit to
the observed data, (c) argued for the adequacy of model fit,
and (d) evaluated factorial validity on the basis of this a
priori model. Such validity claims, however, may be considered
dubious for at least two reasons: (a) in many cases, model fit
was only marginally good, and (b) these models did not allow
for sample-specific artifacts such as nonrandom measurement
error (i.e., correlated error) and/or secondary factor
loadings, two findings not uncommon to measures of
psychological constructs (see e.g., Byrne, in press; Byrne &
Shavelson, 1986; Huba, Wingard, & Bentler, 1981; Newcomb, Huba,
& Bentler, 1986; Tanaka & Huba, 1984). More appropriately,
model fitting should continue beyond the initially hypothesized
model until a statistically best-fitting model is determined;
additional analyses can then be conducted to establish which
parameters are statistically, as well as substantively,
important to the CFA model. In so doing, both practical and
statistical significance are taken into account (Muthen,
personal communication, January, 1987; see also Huba et al.,
1981; Tanaka & Huba, 1984).

While some have criticized such post hoc model-fitting
practices (e.g., Browne, 1982; Fornell, 1983; MacCallum, 1987),
Tanaka and Huba (1984) have argued that the process can be
substantively meaningful. For example, if the estimates of
major parameters undergo no appreciable change when minor
parameters are added to the model, this is an indication that
the initially hypothesized model is empirically robust; the
more fitted model therefore represents a minor improvement to
an already adequate model and the additional parameters should
be deleted from the model. If, on the other hand, the major
parameters undergo substantial alteration, the exclusion of the
post hoc parameters may lead to biased estimates (Alwin &
Jackson, 1980; Joreskog, 1983); the minor parameters should
therefore be retained in the model.

One method of estimating the practical significance of post
hoc parameters is to correlate major parameters (the
$\lambda$'s and $\phi$'s) in the initially hypothesized model
with those in the best-fitting post hoc model (cf. Marsh,
1987b). Coefficients close to 1.00 argue for the stability of
the initial model and thus, the triviality of the minor
parameters in the post hoc model. In contrast, coefficients
that are not close to 1.00 (say, < .90) are an indication that
the major parameters were adversely affected, and thus argue
for the inclusion of the post hoc parameters in the final
baseline model.
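
As a rough illustration of this stability check, the sketch
below correlates the major parameter estimates of an initial
model with those of a post hoc model and applies the .90 rule
of thumb. The function names and all numeric estimates are
hypothetical, made up for illustration; they are not values
from the present study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long lists of estimates."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("estimates must vary for a correlation to exist")
    return cov / math.sqrt(ss_x * ss_y)


def parameters_stable(initial: Sequence[float],
                      post_hoc: Sequence[float],
                      cutoff: float = 0.90) -> bool:
    """True when the major parameters barely move between the two models."""
    return pearson_r(initial, post_hoc) >= cutoff


# Hypothetical factor loadings for the same four indicators in two models.
initial_loadings = [0.70, 0.65, 0.80, 0.55]
post_hoc_stable = [0.71, 0.64, 0.79, 0.56]    # loadings barely change
post_hoc_shifted = [0.70, 0.30, 0.45, 0.75]   # loadings change markedly

print(parameters_stable(initial_loadings, post_hoc_stable))
print(parameters_stable(initial_loadings, post_hoc_shifted))
```

In the first case the minor parameters could be dropped; in the
second, the shift in the major parameters would argue for
keeping them in the baseline model.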

A second limitation of previous research relates to tests 
of factorial invariance. In particular, researchers have 
conducted such tests at the matrix level only; when confronted 
with a noninvariant $\Lambda$ or $\Phi$, they have not continued
testing to determine the aberrant parameter(s) that contributed
to the noninvariance (see Benson, 1987; Marsh, 1985, 1987b;
Marsh & Hocevar, 1985; Marsh et al., 1985). Consequently,
readers are left with the impression that, given a noninvariant
pattern of factor loadings, further testing of invariance is
unwarranted. This conclusion, however, is unfounded when the
model specification includes multiple indicators of a construct
(Muthen & Christoffersson, 1981). (For an extended discussion,
review of the literature, and application, see Byrne et al., in
press; for an application involving dichotomous variables, see
Muthen & Christoffersson, 1981.)

In examining factorial validity, partial measurement
invariance is important because it bears directly on further
testing of measurement and/or structural equivalencies. For
example, the researcher may wish to test whether the
theoretical structure of the underlying construct is equivalent
across groups; the invariance of factor covariances, then, is
of primary interest (see e.g., Marsh, 1985; Marsh & Hocevar,
1985). Alternatively, the investigator may be interested in
testing for the invariance of item or subscale reliabilities; in
this case, the invariance of factor variances is of interest
(see Cole & Maxwell, 1985; Rock et al., 1978). In testing for
the invariances of factor variances and covariances, equality
constraints are imposed on only those factor loadings known to
be invariant across groups; this may include all, or only a
portion of the factor loading parameters.

A final limitation concerns studies that have investigated 
the invariance of item (Benson, 1987; Marsh, 1985, 1987b; Marsh 
& Hocevar, 1985; Marsh et al., 1985) or subscale (Byrne &
Shavelson, 1987) reliabilities across groups. Three additional
studies (Corcoran, 1980; Hare & Mason, 1980; Wolfle &
Robertshaw, 1983) are reported here for sake of completeness;
the focus here, however, was on the equivalence of response
error, rather than on specific test item or subscale
reliabilities. Each of these studies tested for the invariance
of measurement reliabilities by placing constraints on both the
$\Lambda$ and the $\Theta$ parameters. However, this procedure
is valid only when the factor variances are known to be
equivalent across groups (Cole & Maxwell, 1985; Rock et al.,
1978). When variances are noninvariant, it is necessary to
check the ratio of true and error variances in testing for the
equivalence of reliabilities (see Werts, Rock, Linn, &
Joreskog, 1976).
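
Why equal loadings and equal error variances do not by
themselves imply equal reliabilities can be sketched
numerically. The example assumes a congeneric indicator with
uncorrelated errors, so that its true-score variance is the
squared loading times the factor variance; the function name
and the numeric values are illustrative assumptions, not
estimates from any study cited here.

```python
def indicator_reliability(loading: float,
                          factor_variance: float,
                          error_variance: float) -> float:
    """Ratio of true-score variance to total variance for one indicator."""
    true_variance = loading ** 2 * factor_variance
    return true_variance / (true_variance + error_variance)


# Same loading and error variance in both groups (hypothetical values),
# but the factor variance differs across groups.
group_a = indicator_reliability(loading=0.8, factor_variance=1.0,
                                error_variance=0.36)
group_b = indicator_reliability(loading=0.8, factor_variance=2.0,
                                error_variance=0.36)

print(round(group_a, 3), round(group_b, 3))
```

With identical loadings and error variances, the group with the
larger factor variance still shows the higher reliability,
which is why the equivalence of factor variances has to be
tested rather than assumed.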

In sum, four methodological weaknesses are evident with
previous LISREL CFA validity studies of measuring instruments.
First, model-fitting procedures have been incomplete in the
determination of adequately specified baseline models. Second,
testing for partial measurement invariance has not been
considered. Third, given the failure to test for, and identify,
partially invariant item scaling units, researchers have not
been able to proceed with testing for the invariance of
structural parameters. Finally, tests for the invariance of
item (or subscale) reliabilities have assumed, rather than
tested for, the equivalency of factor variances. As such,
testing for the invariance of reliabilities has been
incomplete, and in many cases, incorrectly executed. The
purpose of this paper is to address these limitations by
demonstrating the steps involved in: (a) conducting a
sensitivity analysis to determine a baseline model that is
statistically best-fitting, yet substantively most meaningful,
(b) testing for, and testing with, partial measurement
invariance, and (c) testing for the invariance of subscale and
item reliabilities.

Application of LISREL Confirmatory Factor Analyses

The Measuring Instrument 

The Perceived Competence Scale for Children (Harter, 1982) 
is used here for demonstration purposes. This 28-item 
self-report instrument measures four facets of perceived 
competence: cognitive competence (i.e., academic ability),
physical competence (i.e., athletic ability), social competence
(i.e., social acceptance by peers), and general self-worth
(i.e., global self-esteem). Each 7-item subscale has a 4-point
"structured alternative" question format ranging from not very
competent (1), to very competent (4). (For a summary of
psychometric properties, see Byrne & Schneider, 1988; Harter,
1982.)

Data Base 

Data for the present demonstration came from a larger study 
that examined social relation differences between gifted
students and their non-gifted peers (see Schneider, Clegg,
Byrne, Ledingham, & Crombie, in press). Following listwise
deletion of missing data, the sample for the present paper
comprised 261 grade 5 (129 normal, 132 gifted) and 230 grade 8
(113 normal, 117 gifted) children from the two public school
systems in Ottawa, Canada. Overall, an examination of item
skewness and kurtosis revealed a distribution that was
approximately normal for each group (see Muthen & Kaplan,
1985). (For details concerning descriptive statistics,
selection criteria and sampling procedures, see Byrne &
Schneider, 1988.)
Analysis of the Data 

Analyses are conducted in two major stages. First, the
factorial validity of the PCSC is tested separately for grades
5 and 8 in the normal and gifted samples, and a baseline model
established for each of the four groups. Second, tests for the
factorial invariance of item responses across grade are
conducted separately for the normal and gifted samples.

Analyses are based on an item-pair structure (with the 
exception of one item in each subscale). As such, the seven
items in each subscale are paired off, with items 1 and 2
forming the first couplet, items 3 and 4 the second couplet,
and items 5 and 6 the third couplet; item 7 remains a
singleton. The decision to use item-pairs was based on two
primary factors: (a) the low ratio of number of subjects per
test item for each subsample, and (b) preliminary EFA results
derived from single-item analyses indicating, for the most
part, that items were reasonably homogeneous in their
domain-specific measurements of perceived competence (see Byrne
& Schneider, 1988). Furthermore, Marsh, Barnes, Cairns, &
Tidman (1984) have argued that the analysis of item-pairs is
preferable to single items for at least four additional
reasons; item-pair variables are likely to: (a) be more
reliable, (b) contain less unique variance since they are less
affected by the idiosyncratic wording of individual items, (c)
be more normally distributed, and (d) yield results having a
higher degree of generalizability.
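
The pairing scheme can be sketched as follows for one 7-item
subscale. How each item-pair score was actually computed is not
stated in the text; summing the two items is an assumption made
here for illustration, and the function name is hypothetical.

```python
from typing import List, Sequence


def form_item_pairs(item_scores: Sequence[int]) -> List[int]:
    """Collapse seven item responses into three item-pair scores plus a singleton.

    Items 1-2, 3-4 and 5-6 form the three couplets; item 7 stays on its own.
    Assumption: a pair score is the sum of its two item responses.
    """
    if len(item_scores) != 7:
        raise ValueError("each subscale is expected to have exactly 7 items")
    pairs = [item_scores[i] + item_scores[i + 1] for i in (0, 2, 4)]
    return pairs + [item_scores[6]]


# One respondent's answers (1-4 scale) on a single subscale; made-up values.
print(form_item_pairs([1, 2, 3, 4, 1, 2, 3]))  # -> [3, 7, 3, 3]
```

Applied to all four subscales, this reduces the 28 items to 16
observed variables (12 item pairs and 4 singletons).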

The CFA model in the present study hypothesizes a priori 
that: (a) responses to the PCSC can be explained by four
factors, (b) each item-pair (and item singleton) has a non-zero
loading on the perceived competence factor that it is designed
to measure (i.e., target loading), and zero loadings on all
other factors (i.e., non-target loadings), (c) the four factors
are correlated, and (d) error/uniqueness terms for the
item-pair (and item singleton) variables are uncorrelated.
Parameter specifications are summarized in Table 1.
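
The hypothesized pattern of target and non-target loadings can
be laid out as a 16 x 4 matrix of ones and zeros. The sketch
below is illustrative only: the factor order (PGS, PCC, PSC,
PPC) is inferred from the loading subscripts used later in the
paper and should be read as an assumption, the helper names are
hypothetical, and a 1 marks a loading that is not fixed to zero
(in practice one loading per factor would be fixed to set the
scale).

```python
from typing import List, Tuple

# Assumed factor order: general self, cognitive, social, physical.
FACTORS = ["PGS", "PCC", "PSC", "PPC"]
INDICATORS_PER_FACTOR = 4  # three item pairs plus one singleton


def hypothesized_pattern() -> Tuple[List[str], List[List[int]]]:
    """Return indicator labels and the target-loading pattern (1 = non-zero)."""
    labels: List[str] = []
    pattern: List[List[int]] = []
    for f_index, factor in enumerate(FACTORS):
        for k in range(1, INDICATORS_PER_FACTOR + 1):
            labels.append(f"{factor}{k}")
            row = [0] * len(FACTORS)
            row[f_index] = 1          # target loading only
            pattern.append(row)
    return labels, pattern


def free_secondary_loading(labels: List[str], pattern: List[List[int]],
                           indicator: str, factor: str) -> List[List[int]]:
    """Copy the pattern and free one non-target (secondary) loading."""
    new_pattern = [row[:] for row in pattern]
    new_pattern[labels.index(indicator)][FACTORS.index(factor)] = 1
    return new_pattern


labels, pattern = hypothesized_pattern()
# Example of a post hoc modification: let PPC4 also load on PSC.
modified = free_secondary_loading(labels, pattern, "PPC4", "PSC")
for label, row in zip(labels, modified):
    print(label, row)
```

Each row of the hypothesized pattern contains exactly one
non-zero entry; a post hoc secondary loading shows up as a
second 1 in the row of the indicator concerned.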
Insert Table 1 about here



Covariance structure analysis has traditionally relied on
the $\chi^2$ likelihood ratio test as a criterion for assessing
the extent to which a proposed model fits the observed data; a
nonsignificant $\chi^2$ indicates a well-fitting model. However,
the sensitivity of the $\chi^2$ statistic to sample size, as
well as to various model assumptions (i.e., linearity,
multinormality, additivity), is now well known (see e.g.,
Bentler & Bonett, 1980; Fornell, 1983; Huba & Harlow, 1987;
Joreskog, 1982; Marsh & Hocevar, 1985; Muthen & Kaplan, 1985;
Tanaka, 1987). As an alternative to $\chi^2$, other
goodness-of-fit indices have been proposed (see e.g., Bentler &
Bonett, 1980; Hoelter, 1983; Tanaka & Huba, 1985; Tucker &
Lewis, 1973). Researchers, however, have been urged not to
judge model fit solely on the basis of $\chi^2$ values (Bentler
& Bonett, 1980; Joreskog & Sorbom, 1985), or on alternative fit
indices (Sobel & Bohrnstedt, 1985); rather, assessments should
be based on multiple criteria, including "substantive,
theoretical and conceptual considerations" (Joreskog, 1971, p.
421; see also Sobel & Bohrnstedt, 1985).

Assessment of model fit in the present example is based on
(a) the $\chi^2$ likelihood ratio test, (b) the $\chi^2$/df
ratio, (c) T-values, normalized residuals and modification
indices provided by LISREL VI, and (d) knowledge of substantive
and theoretical research in this area.
Fitting the Baseline Model

Since parameter specifications for the hypothesized 
4-factor model do not include equality constraints between
various subsamples, all analyses are performed on the observed
correlation matrix for each group. Results of the model-fitting
process are reported in Tables 2 and 3 for the normal and
gifted samples, respectively.

Normal sample. As shown in Table 2, the initial model
(Model 1) represented a fairly reasonable fit to the observed
data for grade 5 students ($\chi^2$/df = 1.55). Nonetheless, an
examination of the modification indices revealed three
off-diagonal values in the $\Theta$ matrix that were greater
than 5.00 (see Joreskog & Sorbom, 1985). These parameters
represented error covariances between item variables, both
within (PSC4, PSC2) and across (PPC4, PSC1; PCC1, PGS3)
subscales. Such findings, as noted earlier, are often
encountered with models of psychological phenomena, but are
particularly evident when the model represents items (i.e.,
observed variables) and subscale factors (i.e., latent
variables) from a single measuring instrument (see e.g., Byrne,
in press; Byrne & Shavelson, 1987); error covariances in these
instances are considered substantively plausible since they
indicate nonrandom error introduced by a particular measurement
method such as item format.



Insert Table 2 about here 



To determine the statistical and practical significance of 
these error covariances, then, model fitting continued with the 
specification of three alternative models (Models 2-4). In each
model, the error covariance in question was specified as a
free, rather than as a fixed parameter. Since a difference in
$\chi^2$ ($\Delta\chi^2$) for competing (i.e., nested) models is
itself $\chi^2$-distributed with degrees of freedom equal to the
difference in degrees of freedom, this indicator is used to
judge whether the reestimated model resulted in a statistically
significant improvement in fit. Model 4 ultimately yielded the
model of best fit ($\chi^2_{95}$ = 117.57, p > .05; $\chi^2$/df =
1.24) and also demonstrated a significant improvement in fit
($\Delta\chi^2_{1}$ = 8.96, p < .01).
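
The fit statistics reported for Model 4 can be checked with a
few lines of code. The sketch below uses only the Python
standard library; `chi2_sf` is a hypothetical helper written
for this illustration (a closed-form series for integer degrees
of freedom), not a LISREL routine, and the 1 degree of freedom
for the difference test is an assumption based on one error
covariance being freed per model.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variate, integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    tail = math.erfc(chi / math.sqrt(2.0))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total


# Values reported in the text for the grade 5 normal sample, Model 4.
chi2_model4, df_model4 = 117.57, 95
delta_chi2, delta_df = 8.96, 1   # delta_df = 1 is an assumption (see lead-in)

print("chi2/df ratio:", round(chi2_model4 / df_model4, 2))
print("overall fit p > .05:", chi2_sf(chi2_model4, df_model4) > 0.05)
print("improvement p < .01:", chi2_sf(delta_chi2, delta_df) < 0.01)
```

The same helper applies to any nested comparison: subtract the
two $\chi^2$ values and the two degrees of freedom, then refer
the difference to the $\chi^2$ distribution.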

However, given the known sensitivity of the $\chi^2$ statistic
discussed earlier, some researchers have preferred to look at
differences between (a) the absolute magnitude of estimates
(Werts et al., 1976), (b) the magnitude of estimates expressed
as $\chi^2$/df ratios (see e.g., Marsh & Hocevar, 1985), or (c)
the $\chi^2$/df ratios of nested models, as a more realistic
index of model improvement (see e.g., Marsh, 1985, 1987b). An
examination of differences between the $\chi^2$/df ratios in the
present data showed values of .11, .12 and .08 (Models 2-4,
respectively), suggesting that the impact of the post hoc
parameters on the specified model was fairly trivial. This
notion was supported by three additional pieces of evidence.
First, the error covariance estimates, while statistically
significant (T-values > 2.00), were of relatively minor
magnitude (mean $\hat{\theta}$ = .06). Second, visual inspection
of the factor loadings and factor covariances in Models 1 and 4
revealed little fluctuation in their estimated values. Third,
the factor loadings in Model 1 were highly correlated with
those in Model 4 (r = .95); likewise for correlations computed
between the factor variance-covariances (r = .99). Since the
addition of the error covariance parameters to the model
altered neither the measurement parameters (see Bagozzi, 1983),
nor the structural parameters (see Fornell, 1983), their impact
on the model was clearly trivial. These results thus verified
the parameter stability of the initially hypothesized model;
Model 1 was, therefore, considered as baseline for grade 5 in
all subsequent analyses.

The hypothesized 4-factor model for grade 8, as shown in
Table 2, represented a good fit to the data ($\chi^2$/df =
1.35). Although an examination of the modification indices
suggested possible model-fit improvement if error terms between
two item variables were allowed to covary, the fit differential
was not statistically significant ($\Delta\chi^2_{1}$ = 3.33, p >
.05); Model 1, therefore, was considered baseline for the grade
8 normal sample.

Gifted sample. Model-fitting results for the gifted
differed substantially from those for their normal peers. These
results are presented in Table 3. Let us look first at the fit
statistics for grade 5. We can see that the initially
hypothesized 4-factor model (Model 1) does not represent a
particularly good fit to the data ($\chi^2_{98}$ = 160.43). To
investigate the misfit, model fitting proceeded as before with
the normal sample. A substantial drop in $\chi^2$ was found when
item PPC4 ($\Delta\chi^2_{1}$ = 25.57, p < .001) and item PGS4
($\Delta\chi^2_{1}$ = 17.99, p < .001) were free to cross-load on
the social (PSC) and cognitive (PCC) factors, respectively.



Insert Table 3 about here 



In contrast to the post hoc error covariances encountered
with the normal sample, these parameters represented fairly
major alterations to the initial 4-factor model and bear
importantly on the factorial validity of the Harter instrument.
The decision to accept Model 3 as baseline for the grade 5
gifted was based on three primary considerations. First, the
secondary loadings of PPC4 on the PSC factor ($\lambda_{16,3}$),
and PGS4 on the PCC factor ($\lambda_{42}$) were both highly
significant (T-values = 4.97; 4.09, respectively) and of fairly
high magnitude ($\hat{\lambda}$ = .61; .65, respectively).
Second, the factor loading correlation between Models 1 and 3
was .68, suggesting that the Model 1 measurement estimates were
somewhat unstable; the structural parameters, on the other
hand, appeared to be very stable (r = .99). Finally, the
findings were consistent with an earlier EFA of the data which
indicated evidence of the same cross-loading pattern (see Byrne
& Schneider, 1988).

A review of the model-fitting results for grade 8 (see
Table 3) reveals the secondary factor loadings noted earlier
to be common to both groups of gifted students. However, a
well-fitting model for the grade 8 subsample was realized only
when two further restrictions on the hypothesized model (Model
1) were relaxed; these included one error covariance between
Item 4 and Item-pair 1 on the perceived cognitive competence
subscale (PCC4, PCC1; $\Delta\chi^2_{1}$ = 25.74, p < .001) and
one secondary factor loading (PGS2 on PSC; $\Delta\chi^2_{1}$ =
14.14, p < .001).

Following these analyses, Model 5 was considered baseline
for the grade 8 gifted. As with the previous subsamples, this
decision was linked to several factors. First, the secondary
loadings of PPC4, PGS4 and PGS2 on the PSC, PCC and PSC
factors, respectively, were statistically significant (T-values
= 4.74, 4.05, 3.80, respectively); the factor loading estimates
were also of substantial magnitude ($\hat{\lambda}$ = .45, .35,
.34, respectively). Second, the error covariance estimate,
unlike those for the normal sample, was highly significant
(T-value = 5.76) and fairly large ($\hat{\theta}$ = .43); given
the size of this estimate, it was considered risky to constrain
the parameter to zero since this specification could have an
important biasing effect on other parameters in the model
(Alwin & Jackson, 1980; Joreskog, 1983). Third, fluctuation of
the factor loading estimates, albeit more modest than for grade
5, was evident between Models 1 and 5; this instability was
verified by a correlation of .87 between $\lambda$ parameters in
the two models; as with the grade 5 findings, the structural
parameters were shown to be fairly stable (r = .94). Finally,
the cross-loading of factors for the grade 8 sample was
consistent with findings by Byrne and Schneider in the EFA
study noted earlier.
Testing for Invariance 

Tests of invariance involved specifying a model in which 
certain parameters were constrained to be equal across groups 
and then comparing that model with a less restrictive model in 
which these parameters were free to take on any value. As with 
model-fitting, the $\Delta\chi^2$ between competing models
provided a basis for determining the tenability of the
hypothesized equality constraints, a significant $\Delta\chi^2$
indicating noninvariance. Unlike the model-fitting analyses,
however, the simultaneous estimation of parameters was based on
the covariance, rather than on the correlation matrix for each
group (see Joreskog & Sorbom, 1985). For purposes of the
present demonstration, invariance-testing procedures are
applied to the gifted sample only, since it is the more
interesting of the two samples in terms of model specification;
analyses focus on equivalencies across grades 5 and 8. We first
test for the equality of item scaling units (i.e., factor
loadings; $\lambda$'s), components of the
measurement model. Once we have determined which item pairs 
(and/or single items) are invariant, we can then proceed with 
tests for the equality of subscale (i.e., factor) covariances, 
components of the structural model. Finally, we test for the 
equality of subscale and item reliabilities. 

As noted earlier, once baseline models are determined, any 
discrepancies in parameter specifications across groups remain 
so throughout the analyses. In the present application, for 
example, the secondary loading in the $\Lambda$ matrix
($\lambda_{23}$), and the error covariance in the $\Theta$ matrix
($\theta_{85}$) for grade 8, remained unconstrained for all
tests of invariance. The baseline model parameter estimates for
the grade 5 and grade 8 gifted are summarized in Tables 4 and
5, respectively.
Insert Tables 4 and 5 about here



Equality of item scaling units. Since the initial
hypothesis of equality of covariance matrices was rejected
($\chi^2$ = 209.81, p < .001), invariance testing proceeded,
first, to the equivalence of item scaling units. These results
are summarized in Table 6.



Insert Table 6 about here



The simultaneous 4-factor solution for each group yielded a
reasonable fit to the data ($\chi^2_{190}$ = 232.08). These
results suggest that for both grades, the data were well
described by the four perceived competence factors. This
finding, however, does not necessarily imply that the actual
factor loadings are the same across grade. Thus, the hypothesis
of an invariant pattern of loadings was tested by placing
equality constraints on all lambda parameters (including the
two common secondary loadings, $\lambda_{16,3}$ and
$\lambda_{42}$, but excluding $\lambda_{23}$, the secondary
loading specific to grade 8), and then comparing this model
(Model 2) with Model 1 in which only the number of factors was
held invariant. The difference in $\chi^2$ was highly
significant ($\Delta\chi^2_{14}$ = 38.93, p < .001); thus, the
hypothesis of an equivalent pattern of scaling units was
untenable.

In order to identify which scaling units were noninvariant,
and thus detect partial measurement invariance, it seemed
prudent to first determine whether or not the two common
secondary loadings were invariant across grade. As such,
equality constraints were imposed on λ16,3 and λ42, and the model
reestimated; this hypothesis was found tenable (Δχ²(2) = 5.10,
p > .05). Tests of invariance proceeded next to (a) test each
congeneric set of scaling units (i.e., parameters specified as
loading on the same factor) and then, given findings of
noninvariance, to (b) examine the equality of each item scaling
unit individually. For example, in testing for the equality of
all scaling units measuring perceived general self (PGS), λ21,
λ31, and λ41, as well as λ16,3 and λ42, were held invariant across
groups. Given that this hypothesis was untenable (Δχ²(5) = 24.66,
p < .001), each factor loading (λ21, λ31, λ41) was tested
independently to determine whether it was invariant across
grade; λ16,3 and λ42 were also held concomitantly invariant. These
analyses detected one item scaling unit (PGS2; λ21) to be
noninvariant across grade.
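
The nested-model comparisons reported above can be sketched in a few lines. This is only an illustration under stated assumptions, not the LISREL procedure itself: the helper names `chi2_sf` and `chi2_difference_test` are hypothetical, the tail probability is computed from the standard closed-form series for integer degrees of freedom, and the input values are the χ² and df reported for Models 1, 3, and 4 in Table 6.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate, integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-y) * sum_{j < df/2} y**j / j!
        term = math.exp(-y)
        total = term
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return total
    # Odd df: erfc(sqrt(y)) plus the half-integer terms of the same recurrence.
    total = math.erfc(math.sqrt(y))
    for j in range((df - 1) // 2):
        total += math.exp(-y) * y ** (j + 0.5) / math.gamma(j + 1.5)
    return total


def chi2_difference_test(chi2_restricted, df_restricted, chi2_baseline, df_baseline):
    """Return (delta chi-square, delta df, p) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_baseline
    delta_df = df_restricted - df_baseline
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Model 3 (two common secondary loadings invariant) against Model 1.
d3, ddf3, p3 = chi2_difference_test(237.18, 192, 232.08, 190)
# Model 4 (all PGS loadings invariant as well) against Model 1.
d4, ddf4, p4 = chi2_difference_test(256.74, 195, 232.08, 190)

print(round(d3, 2), ddf3, p3 > 0.05)    # constraint tenable
print(round(d4, 2), ddf4, p4 < 0.001)   # constraint untenable
```

Each restricted model is compared with Model 1 rather than with its immediate predecessor, which is how the Δχ² and Δdf columns of Table 6 appear to be computed.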

In a similar manner, the scaling units of all remaining
item pairs (or singletons) were tested for invariance across
grade. As can be seen in Table 6, invariant factor loadings
were held cumulatively invariant, thus providing an extremely
powerful test of factorial invariance. In total, only two item
scaling units were found to be nonequivalent: one item pair
measuring perceived general self (PGS2; λ21) and one single item
measuring perceived social competence (PSC4; λ12,3).

Equality of factor covariances. The first step in testing
for the invariance of structural relations among subscales was
to constrain all factor covariances to be equal across grade.
Equality constraints were subsequently imposed, independently,
on each of the phi parameters. It is important to note that
partial measurement invariance was maintained throughout these
testing procedures. In other words, the following measurement
parameters were held invariant while testing for the equality
of the factor covariances: the two common secondary factor
loadings (λ16,3, λ42), and all factor loadings except λ21 and
λ12,3.

The hypothesis of equivalent factor covariances was found
tenable (Δχ² = 5.12, p > .05).⁹ If, on the other hand, the
hypothesis had been found untenable, the researcher would want
to investigate further the source of this noninvariance. Thus,
as demonstrated with tests of item scaling units, he/she would
proceed to test, independently, each factor covariance
parameter in the Φ matrix; model specification, of course, would
include the partially invariant measurement parameters.
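
One way to keep track of which loadings are held equal under partial measurement invariance is sketched below. It assumes the loading pattern described in the text (four item variables per factor, the first loading of each factor fixed to 1.0, two common secondary loadings, and λ21 and λ12,3 left free across grade). The variable names are illustrative only and are not LISREL syntax.

```python
# Each loading is identified by its (row, column) position in the
# 16 x 4 lambda matrix: rows are item variables, columns are factors.
N_FACTORS = 4
ITEMS_PER_FACTOR = 4

primary = [(ITEMS_PER_FACTOR * f + i + 1, f + 1)
           for f in range(N_FACTORS)
           for i in range(ITEMS_PER_FACTOR)]

# The first loading of each factor is fixed to 1.0 for identification,
# so it is never a candidate for an equality constraint.
fixed = {(ITEMS_PER_FACTOR * f + 1, f + 1) for f in range(N_FACTORS)}
free_primary = [p for p in primary if p not in fixed]

common_secondary = [(16, 3), (4, 2)]   # lambda 16,3 and lambda 42
noninvariant = {(2, 1), (12, 3)}       # PGS2 and PSC4

constrained = [p for p in free_primary + common_secondary
               if p not in noninvariant]

print(len(free_primary))   # free primary loadings per group
print(len(constrained))    # loadings held equal across grade
```

Under these assumptions the count of constrained loadings is 12, which agrees with the Δdf reported for Model 12 in Table 6.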

Equality of reliabilities. Generally speaking, in
multiple-indicator CFA models, testing for the invariance of
reliability is neither necessary (Jöreskog, 1971b), nor of
particular interest when the scales are used merely as CFA
indicators and not as measures in their own right, ignoring
reliability (Muthén, personal communication, October, 1987).
Although Jöreskog (1971a) demonstrated the steps involved in
testing for a completely invariant model (i.e., invariant Λ, Φ,
and Θ), this procedure is considered an excessively stringent
test of factorial invariance (Muthén, personal communication,
January 1987). In fact, Jöreskog (1971b) has shown that while
it is necessary that multiple measures of a latent construct be
congeneric (i.e., believed to measure the same construct), they
need not exhibit invariant variances and error/uniquenesses
(see also Alwin & Jackson, 1980).

When the multiple indicators of a CFA model represent items
from a single measuring instrument, however, it may be of
interest to test for the invariance of item reliabilities. For
example, this procedure was used by Benson (1987) to detect
evidence of item bias in a scale designed to measure
self-concept and racial attitudes for samples of white and
black eighth grade students, and by Munck (1979) to determine
whether the reliabilities of items comprising two attitudinal
measures were equivalent across different nations. In contrast
to the conceptual definition of item bias generally associated
with cognitive instruments (i.e., individuals of equal ability
have unequal probability of success), item bias related to
affective instruments reflects on the instrument's validity, and
hence, on the question of whether items generate the same
meaning across groups; evidence of such item bias is a clear
indication that the scores are differentially valid (Green,
1975).

In the present example, the invariance of factor variances
was tested first, in order to establish whether it was viable
to impose equality constraints on the λ and δ for each item or
whether, in light of nonequivalent factor variances, invariance
testing should instead be based on the ratio of true and error
variances (see Cole & Maxwell, 1985; Rock et al., 1978). The
hypothesis of equivalent factor variances was found tenable
(Δχ² = 5.20, p > .05; see Footnote 10). As such, the reliability
of each item pair (or singleton) was tested for invariance
across grade by imposing equality constraints on the respective
λ and δ parameters; as with previous tests of item scaling units,
equally reliable items were held cumulatively invariant
throughout the testing sequence. These results are summarized
in Table 7.



Insert Table 7 about here 



Tests of invariance proceeded, first, by testing for the
equivalency of each subscale; only the Perceived Cognitive
Competence subscale (PCC) was found to be equivalent across
grade (Δχ²(7) = 8.49, p > .05).¹⁰ Subsequently, the reliability
equivalency of each item pair (or singleton) was tested. Had
tests of invariance revealed the factor variances to be
nonequivalent, on the other hand, it would have been necessary
to test for item reliability by examining the ratio of true and
error score variances (for an explanation of this procedure,
see Munck, 1979; Werts et al., 1976).
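
Why the factor variances must be checked first can be illustrated with the usual expression for the reliability of a congeneric indicator, λ²φ / (λ²φ + θ), where φ is the factor variance and θ the error variance. This expression is supplied here for illustration and is not quoted from the paper; the function name and all numeric values are hypothetical.

```python
def item_reliability(loading, factor_variance, error_variance):
    """Ratio of true-score variance to total variance for one indicator."""
    true_variance = loading ** 2 * factor_variance
    return true_variance / (true_variance + error_variance)


# Equal loading, error variance, and factor variance in both grades:
# constraining lambda and delta is enough to equate the reliabilities.
r_grade5 = item_reliability(0.8, 1.0, 0.36)
r_grade8 = item_reliability(0.8, 1.0, 0.36)

# Same loading and error variance, but a larger factor variance in one grade:
# the reliabilities now differ even though lambda and delta are invariant.
r_grade8_unequal = item_reliability(0.8, 1.5, 0.36)

print(round(r_grade5, 3), round(r_grade8, 3), round(r_grade8_unequal, 3))
```

When the factor variances differ, equality constraints on λ and δ alone no longer imply equal reliability, which is why the ratio of true to error variance would then have to be tested directly.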

Conclusion 

While the use of LISREL CFA procedures is becoming more
prevalent in construct validity research in general, relatively
few studies have applied this approach to the validation of
single measuring instruments, in particular. However, of the
studies that have used the procedure for testing the factorial
validity and invariance of a single instrument, most share four
methodological weaknesses; these relate to the failure: (a) to
determine an adequately specified baseline model, (b) to test
for partial measurement invariance, (c) to test for the
invariance of structural parameters, given partially invariant
item scaling units, and (d) to test for the equivalence of
factor variances prior to testing for the invariance of test
item reliabilities.

The present paper addressed these limitations in an
application to data comprising self-report responses to the
Harter (1982) Perceived Competence Scale for Children by grades
5 and 8 normal and gifted children. Specifically, the paper
demonstrated the steps involved in (a) the conduct of
sensitivity analyses to determine a statistically best fitting,
yet substantively most meaningful baseline model, (b) testing
for partial measurement invariance, (c) testing for the
invariance of factor variances and covariances, given partial
measurement invariance, and (d) testing for the invariance of
test item and subscale reliabilities. These procedures,
historically, have received scant attention in the literature.
It is hoped that the present illustration will be helpful in
providing guidelines to future LISREL CFA research bearing on
the construct validity of an assessment instrument.
Footnotes

1. If tests of factor means are of interest, the measurement
model would also include the regression intercept (nu, ν), a
vector of constant intercept terms. In the basic CFA model,
however, variable means are not of interest since they are
neither structured nor explained by the constructs (Bentler,
1978).

2. For the same reason as noted in Footnote 1, the gamma (Γ), a
vector of mean estimates, is not included in the structural
model.

3. Secondary loadings are measurement loadings on more than one
factor.

4. The absolute χ²/df ratio value that represents a reasonable
fit to the data remains a controversial issue. For example,
Muthén (personal communication, October, 1987) contends that
a χ²/df ratio >1.50 indicates a malfitting model for data
that are normed to a sample size of 1000. On the other hand,
Carmines and McIver (1981) argue that an acceptable χ²/df
ratio can range as high as 3.00. Taking a midpoint between
these two extremes, it seems likely that, with sample sizes
less than 1000, a coefficient >2.00 is a fairly good
indication of model misfit.

5. This post hoc fitting procedure has been referred to as
tests for "substantive invariance" (Tanaka & Huba, 1984) and
as "sensitivity analyses" (Byrne et al., in press).

6. Mean skewness and kurtosis values were as follows: normal
(grade 5, SK = -.47, KU = -.70; grade 8, SK = -.38, KU =
[illegible]); gifted (grade 5, SK = [illegible], KU = -.50;
grade 8, SK = [illegible], KU = .01).

7. The reader is advised that if start values were included in
the initial input, these will likely need to be increased in
order to make them compatible with covariance, rather than
correlation values.

8. Since χ²'s and their corresponding degrees of freedom are
additive, the sum of χ²'s (see Table 6) reflects how well the
underlying factor structure fits the data across groups.

9. This model was compared with one in which all items known to
be invariant were constrained equal across grade (Model
12, see Table 6).

10. Although the PCC subscale, as a whole, was found to be
invariant, tests of individual item parameters revealed the
first item pair (PCC1) to be noninvariant; this illustrates
the possibility of masking information when analyses are
conducted at the more macroscopic subscale level.
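
The additivity noted in Footnote 8 can be checked directly against the reported values. The short sketch below sums the χ² and df of the two gifted baseline models from Table 3 and compares the result with Model 1 of Table 6; the variable names are illustrative.

```python
# Final baseline models for the gifted sample (Table 3): (chi-square, df).
baselines = {"grade 5": (116.87, 96), "grade 8": (115.21, 94)}

# Chi-squares and degrees of freedom add across independent groups.
total_chi2 = round(sum(chi2 for chi2, _ in baselines.values()), 2)
total_df = sum(df for _, df in baselines.values())

print(total_chi2, total_df)              # compare with Model 1 in Table 6
print(round(total_chi2 / total_df, 2))   # chi-square/df ratio
```

The totals (232.08 with 190 df, ratio 1.22) match the simultaneous 4-factor solution reported as Model 1.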
Table 1

Pattern of LISREL Parameters for Model Fitting

[Parameter matrices not recoverable from the source. The surviving
fragment appears to show the lower rows (PCC1 through PPC4) of the
error variance-covariance matrix, with free diagonal elements and
off-diagonal elements fixed to 0.]

ᵃ Fixed parameter

X = observed item measures for the Perceived Competence Scale for Children
(PCSC); ξ1-ξ4 = perceived competence subscales (i.e., factors) of the PCSC
(ξ1 = perceived general self; ξ2 = perceived cognitive competence; ξ3 = perceived
social competence; ξ4 = perceived physical competence); Λx = factor loading matrix;
Φ = factor variance-covariance matrix; Θδ = error variance-covariance matrix.
PGS1-PGS3 = paired items #4/8, 12/16, 20/24 measuring perceived general self
(PGS); PGS4 = item #28 measuring PGS; PCC1-PCC3 = paired items #1/5, 9/13, 17/21
measuring perceived cognitive competence (PCC); PCC4 = item #25 measuring PCC;
PSC1-PSC3 = paired items #2/6, 10/14, 18/22 measuring perceived social
competence (PSC); PSC4 = item #26 measuring PSC; PPC1-PPC3 = paired items #3/7,
11/15, 19/23 measuring perceived physical competence (PPC); PPC4 = item #27
measuring PPC.



Table 2

Steps in Model Fitting for the Normal Sample

Competing Models                      χ²       df   p     Δχ²        Δdf   χ²/df

Grade 5

1  Basic 4-factor model               152.26   98   .00                    1.55
2  Model 1 with correlated error      139.45   97   .00   12.81***   1     1.44
   between PPC4 and PSC3
3  Model 2 with correlated error      126.53   96   .02   12.92***   1     1.32
   between PSC4 and PSC2
4  Model 3 with correlated error      117.57   95   .06    8.96**    1     1.24
   between PCC1 and PGS3

Grade 8

1  Basic 4-factor modelᵃ              132.13   98   .01                    1.35
2  Model 1 with correlated error      120.55   97   .05    3.33      1     1.24
   between PGS4 and PGS3

**p < .01   ***p < .001

ᵃ Final model considered as baseline

PPC4 = item #27 measuring perceived physical competence; PSC3 = paired items #18
and #22 measuring perceived social competence; PSC4 = item #26 measuring
perceived social competence; PSC2 = paired items #10 and #14 measuring perceived
social competence; PCC1 = paired items #1 and #5 measuring perceived cognitive
competence; PGS3 = paired items #20 and #24 measuring perceived general self;
PGS4 = item #28 measuring perceived general self.
Table 3

Steps in Model Fitting for Gifted Sample

Competing Models                      χ²       df   p     Δχ²        Δdf   χ²/df

Grade 5

1  Basic 4-factor model               160.43   98   .00                    1.64
2  Model 1 with PPC4 loading          134.86   97   .00   25.57***   1     1.39
   on PSC
3  Model 2 with PGS4 loading          116.87   96   .07   17.99***   1     1.22
   on PCCᵃ

Grade 8

1  Basic 4-factor model               197.77   98   .00                    2.02
2  Model 1 with PPC4 loading          175.16   97   .00   22.61***   1     1.81
   on PSC
3  Model 2 with correlated            149.42   96   .00   25.74***   1     1.56
   error between PCC4 and PCC1
4  Model 3 with PGS4 loading          129.35   95   .01   20.07***   1     1.36
   on PCC
5  Model 4 with PGS2 loading          115.21   94   .07   14.14***   1     1.23
   on PSCᵃ

***p < .001

ᵃ Final model considered as baseline

PSC = perceived social competence factor; PCC = perceived cognitive competence
factor; PPC4 = item #27 measuring perceived physical competence; PGS4 = item #28
measuring perceived general self; PCC4 = item #25 measuring perceived cognitive
competence; PCC1 = paired items #1 and #5 measuring perceived cognitive
competence; PGS2 = paired items #12 and #16 measuring perceived general self.
Table 4

Baseline Model Parameter Estimates for Grade 5 Giftedᵃ

Measured            Subscale Factors
Item Variablesᵇ     PGS    PCC    PSC    PPC    Error/uniqueness

PGS1                .72    0      0      0      [illegible]
PGS2                .85    0      0      0      [illegible]
PGS3                .83    0      0      0      [illegible]
PGS4                .22    .46    0      0      [illegible]
PCC1                0      .72    0      0      [illegible]
PCC2                0      .69    0      0      [illegible]
PCC3                0      .69    0      0      [illegible]
PCC4                0      .73    0      0      .47
PSC1                0      0      .78    0      [illegible]
PSC2                0      0      .66    0      [illegible]
PSC3                0      0      .76    0      [illegible]
PSC4                0      0      .61    0      [illegible]
PPC1                0      0      0      .76    [illegible]
PPC2                0      0      0      .79    [illegible]
PPC3                0      0      0      .82    .33
PPC4                0      0      .47    .30    [illegible]

Subscale (Factor) Correlations

                    PGS    PCC    PSC
PCC                 .56
PSC                 .61    .42
PPC                 .31    .33    .43

ᵃ Factor loadings and factor correlations are presented in standardized form to
facilitate interpretation.
ᵇ Item variables 1-3 represent the first six items of each subscale, paired
consecutively; item variable 4 represents the seventh item of each subscale.
PGS = perceived general self; PCC = perceived cognitive competence; PSC =
perceived social competence; PPC = perceived physical competence.
Table 5

Baseline Model Parameter Estimates for Grade 8 Giftedᵃ

Measured            Subscale Factors
Item Variablesᵇ     PGS    PCC            PSC    PPC    Error/Uniqueness

PGS1                .88    0              0      0      [illegible]
PGS2                .63    0              .28    0      [illegible]
PGS3                .91    0              0      0      [illegible]
PGS4                .58    [illegible]    0      0      [illegible]
PCC1                0      [illegible]    0      0      [illegible]
PCC2                0      .66            0      0      .57
PCC3                0      .65            0      0      .58
PCC4                0      .89            0      0      .21
PSC1                0      0              .82    0      .33
PSC2                0      0              .83    0      .32
PSC3                0      0              .87    0      .24
PSC4                0      0              .55    0      .70
PPC1                0      0              0      .83    .31
PPC2                0      0              0      .89    .22
PPC3                0      0              0      .37    .22
PPC4                0      0              .37    .55    .38

Subscale (Factor) Correlations

                    PGS    PCC    PSC
PCC                 .33
PSC                 .43    .16
PPC                 .40    .15    .45

ᵃ Factor loadings and factor correlations are presented in standardized form to
facilitate interpretation.
ᵇ Item variables 1-3 represent the first six items of each subscale, paired
consecutively; item variable 4 represents the seventh item of each subscale.
PGS = perceived general self; PCC = perceived cognitive competence; PSC =
perceived social competence; PPC = perceived physical competence.
Table 6

Tests for Invariance of Item Scaling Units Across Grade for the Gifted

Competing Model                          χ²       df    Δχ²        Δdf   χ²/df

 1  Four perceived competence            232.08   190                    1.22
    factors invariant
 2  Model 1 with all factor              271.01   204   38.93***   14    1.33
    loadings invariantᵃ
 3  Model 1 with 2 common                237.18   192    5.10       2    1.24
    secondary loadings invariant
 4  Model 3 with all PGS                 256.74   195   24.66***    5    1.32
    factor loadings invariant
 5  Model 3 with PGS2                    254.33   193   22.25***    3    1.32
    invariant
 6  Model 3 with PGS3                    239.47   193    7.39       3    1.24
    invariant
 7  Model 3 with PGS3,                   240.37   194    8.29       4    1.24
    PGS4 invariant
 8  Model 7 with all PCC                 244.35   197   12.27       7    1.24
    factor loadings invariant
 9  Model 8 with all PSC                 251.37   200   19.29*     10    1.26
    factor loadings invariant
10  Model 8 with PSC2                    245.20   198   13.12       8    1.24
    invariant
11  Model 8 with PSC2,                   245.45   199   13.37       9    1.23
    PSC3 invariant
12  Model 11 with all PPC                248.69   202   16.61      12    1.23
    factor loadings invariant

*p < .05   ***p < .001

ᵃ Including the 2 common secondary factor loadings

The first item-pair loading for each factor was fixed to 1.0 for purposes of
statistical identification. PGS = perceived general self; PCC = perceived
cognitive competence; PSC = perceived social competence; PPC = perceived
physical competence.
Table 7

Tests for Invariance of Subscale and Item Reliabilities Across Grade for the
Gifted

Competing Model                               χ²       df     Δχ²        Δdf   χ²/df

 1. Two common secondary factor               237.18   192                     1.24
    loadings invariant (λ16,3, λ42)

Subscales

 2. PGS subscale: Model 1 with                269.84   199    32.66***    7    1.36
    λ11 to λ41, δ11 to δ44 invariant
 3. PCC subscale: Model 1 with                245.67   199     8.49       7    1.23
    λ52 to λ82 and δ55 to δ88 invariant
 4. PSC subscale: Model 1 with                272.83   206    35.65**    14    1.32
    λ93 to λ12,3, δ99 to δ12,12 invariant
 5. PPC subscale: Model 1 with                269.29   206    32.11**    14    1.31
    λ13,4 to λ16,4, δ13,13 to δ16,16 invariant

Items

 6. Model 1 with λ11 and δ11 invariant        241.23   193ᵃ    4.05*      1    1.25
 7. Model 1 with λ21 and δ22 invariant        254.76   194    17.58***    2    1.31
 8. Model 1 with λ31 and δ33 invariant        246.48   194     9.30**     2    1.27
 9. Model 1 with λ41 and δ44 invariant        242.88   194     5.70       2    1.25
10. Model 9 with λ52 and δ55 invariant        245.56   195ᵃ    8.38*      3    1.26
11. Model 9 with λ62 and δ66 invariant        244.82   196     7.64       4    1.25
12. Model 11 with λ72 and δ77 invariant       245.08   198     7.90       6    1.24
13. Model 12 with λ82 and δ88 invariant       249.19   200    12.01       8    1.25
14. Model 13 with λ93 and δ99 invariant       249.19   201ᵃ   12.01       9    1.24
15. Model 14 with λ10,3 and δ10,10            254.92   203    17.74      11    1.26
    invariant
16. Model 15 with λ11,3 and δ11,11            265.15   205    27.97**    13    1.29
    invariant
17. Model 15 with λ12,3 and δ12,12            266.52   205    29.34**    13    1.30
    invariant
18. Model 15 with λ13,4 and δ13,13            258.14   204ᵃ   20.96      12    1.27
    invariant
19. Model 18 with λ14,4 and δ14,14            266.25   206    29.07*     14    1.29
    invariant
20. Model 18 with λ15,4 and δ15,15            261.04   206    23.86*     14    1.27
    invariant
21. Model 18 with λ16,4 and δ16,16            264.40   206    27.22*     14    1.28
    invariant

*p < .05   **p < .01   ***p < .001

ᵃ Difference in degrees of freedom equals one due to first loading for each factor
being fixed to 1.00.

PGS = perceived general self; PCC = perceived cognitive competence; PSC = perceived
social competence; PPC = perceived physical competence.